A NOVEL HOMEODOMAIN PROTEIN

FIELD OF THE INVENTION 

The invention relates to novel paired homeodomain proteins, nucleic acids and
antibodies, and to a novel method of differential reverse-transcriptase based
polymerase chain reaction (dRT-PCR).

BACKGROUND OF THE INVENTION

The study of cellular diversification during neurogenesis requires markers to
identify different neural cell types. Recently it has become clear that
transcription factors can serve as useful markers of neuronal identity. For
example, the bHLH protein MASH1 identifies autonomic progenitors in the
PNS (Lo et al., 1994). Similarly, Islet-1 (Ericson et al., 1992) and additional
recently characterized proteins in the LIM homeodomain family mark subsets of
functionally distinct motor neurons (Tsuchida et al., 1994). These data
suggest that the diversity of neuronal cell types can be defined in terms of an
underlying diversity of expression of members of a transcription factor gene
family.



Polymerase chain reaction (PCR)-based cloning using degenerate
oligonucleotide primers has proven useful as a method to rapidly isolate
members of a gene family (see, for example, Libert et al., 1989; Wilkie and
Simon, 1991). When cDNA rather than genomic DNA is used as the template,
the method can selectively identify those members of a gene family expressed
in a given tissue or cell type (Johnson et al., 1990; Lai and Lemke, 1991). A
limitation of this approach is that one must often sequence tens if not hundreds
of PCR products to identify novel genes, and then examine their expression
patterns to identify those appropriate for further study. This makes it
labor-intensive to apply this approach to several gene families simultaneously.

Neuropeptides such as CGRP (Murphy et al., 1991) and Substance P (Ito et al.,
1993) have been used as sensory markers, but these neuropeptides are not in
fact sensory neuron-specific: for example, they can be induced in cultured
sympathetic neurons by some cytokines (Fann and Patterson, 1994). In the
absence of definitive sensory neuron markers, unambiguous identification of
sensory neurons in mammalian neural crest cultures has been difficult
(Matsumoto, 1994).

Accordingly, it is an object of the present invention to provide a marker to
identify neurons in the peripheral sensory lineage. It is also an object of the
invention to provide recombinant DRG11 proteins and variants thereof, and to
produce useful quantities of these DRG11 proteins using recombinant DNA
techniques.

It is a further object of the invention to provide recombinant nucleic acids
encoding DRG11 proteins, and expression vectors and host cells containing the
nucleic acid encoding the DRG11 protein.

An additional object of the invention is to provide polyclonal and monoclonal
antibodies directed against DRG11 proteins.

A further object of the invention is to provide methods for producing the
DRG11 proteins.

An additional object is to provide novel methods for the identification of gene 
family members which are specifically expressed in a given tissue or cell type. 

SUMMARY OF THE INVENTION

In accordance with the objects outlined above, the present invention provides
recombinant nucleic acids encoding a DRG11 protein, expression vectors
containing the nucleic acids and host cells transformed with the expression
vectors.

The invention further provides methods of producing a DRG11 protein
comprising culturing a host cell transformed with an expression vector
comprising a DRG11 nucleic acid and expressing the nucleic acid to produce a
DRG11 protein.

The invention additionally provides DRG11 proteins and antibodies.

The invention also provides methods for determining the differential
expression of a gene in different cell types or tissues. First, libraries of nucleic
acids from a plurality of different cell types or tissues are synthesized using a
set of primers. A portion of the library obtained from a first of said different
cell types or tissues is subcloned to form a subclone library. Then, members of
the subclone library are separately contacted with probes, each of which
comprises one of the libraries. The nucleic acids in the libraries are labeled and
the contacting is under conditions which permit the hybridization of the labeled
nucleic acids to complementary nucleic acids, if present, in the subclone
library. Next, a determination is made whether hybridization has occurred for
each of the probes for members of the subclone library as an indication of the
differential expression of a gene expressed by said first cell type or tissue.
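
By way of illustration only, the final determination step of the method above
may be sketched as follows. This is a minimal sketch under stated assumptions:
the function name classify_clones, the numeric signal values, the tissue labels
and the threshold are all hypothetical and are not taken from the specification,
which does not prescribe how hybridization is scored.

```python
from typing import Dict


def classify_clones(signals: Dict[str, Dict[str, float]],
                    first_tissue: str,
                    threshold: float = 1.0) -> Dict[str, str]:
    """Label each subclone by which labeled probe libraries hybridized to it.

    signals[clone][tissue] is a hypothetical hybridization intensity observed
    when the subclone is contacted with the labeled library from `tissue`.
    A signal at or above `threshold` is treated as hybridization (assumption).
    """
    calls = {}
    for clone, by_tissue in signals.items():
        # Tissues whose labeled library hybridized to this subclone.
        positive = sorted(t for t, s in by_tissue.items() if s >= threshold)
        if positive == [first_tissue]:
            # Hybridizes only to the library of the tissue it was cloned from.
            calls[clone] = "specific"
        elif first_tissue in positive:
            # Hybridizes to the first tissue and at least one other.
            calls[clone] = "shared"
        else:
            calls[clone] = "not detected"
    return calls


example = {
    "clone_1": {"tissue_A": 5.0, "tissue_B": 0.1},
    "clone_2": {"tissue_A": 4.0, "tissue_B": 3.0},
}
print(classify_clones(example, first_tissue="tissue_A"))
```

In this sketch a subclone called "specific" is a candidate for a gene
differentially expressed by the first cell type or tissue; a real experiment
would need a scoring rule suited to the labeling and detection used.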



BRIEF DESCRIPTION OF THE DRAWINGS 

Figures 1A and 1B depict the differential reverse-transcriptase polymerase
chain reaction (dRT-PCR) used herein. Figure 1A is a schematic
representation of the dRT-PCR procedure. Figure 1B is an example of
differential hybridization.

Figure 2 depicts the nucleotide sequence of rat DRG11.

Figure 3 depicts the amino acid sequence of rat DRG11.



Figure 4 depicts a homology lineup of the homeodomain sequences among
paired homeodomain proteins. The sources of the sequences illustrated are:
Phox2 (Valarche et al., 1993); Cart1 (Zhao et al., 1993); Phox1 (Grueneberg et
al., 1992); S8 (Opstelten et al., 1991); Pax6 (Walther and Gruss, 1991); Dal
(Schneitz et al., 1993); Dprd (Frigerio et al., 1986); Drepo (Xiong et al., 1994);
and Cunc4 (Miller et al., 1992).



DETAILED DESCRIPTION OF THE INVENTION 



The present invention provides novel DRG11 proteins. In a preferred
embodiment, the DRG11 proteins are from vertebrates, more preferably from
mammals, and in the most preferred embodiment, from rats or humans. However,
using the techniques outlined below, DRG11 proteins from other organisms
may also be obtained.

A DRG11 protein of the present invention may be identified in several ways.
A DRG11 nucleic acid or DRG11 protein is initially identified by substantial
nucleic acid and/or amino acid sequence homology to the sequences shown in
Figures 2 or 3. Such homology can be based upon the overall nucleic acid or
amino acid sequence.

As used herein, a protein is a "DRG11 protein" if the overall homology of the
protein sequence to the amino acid sequence shown in Figure 3 is preferably
greater than about 70%, more preferably greater than about 80% and most
preferably greater than 90%. In some embodiments the homology will be as
high as about 95 or 98%. This homology will be determined using standard
techniques known in the art, such as the Best Fit sequence program described
by Devereux et al., Nucl. Acid Res. 12:387-395 (1984) or the BLASTX
program (Altschul et al., J. Mol. Biol. 215, 403-410). The alignment may
include the introduction of gaps in the sequences to be aligned. In addition, for
sequences which contain either more or fewer amino acids than the protein
shown in the Figures, it is understood that the percentage of homology will be
determined based on the number of homologous amino acids in relation to the
total number of amino acids. Thus, for example, homology of sequences
shorter than that shown in the Figures, as discussed below, will be determined
using the number of amino acids in the shorter sequence.
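
As an illustration of the percentage calculation just described, the following
is a minimal sketch and not the Best Fit or BLASTX procedure itself. It assumes
that the two sequences have already been aligned with "-" as the gap character,
and it treats "homologous" positions as identical residues, which is a
simplifying assumption. The function name percent_homology and the example
sequences are hypothetical.

```python
def percent_homology(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of matching residues, relative to the shorter ungapped sequence.

    Both inputs must be rows of the same pairwise alignment (equal length).
    Identity is used as a stand-in for homology (assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Count aligned columns where both rows carry the same residue.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b)
                  if x == y and x != gap)
    # Residue counts excluding gaps introduced by the alignment.
    len_a = sum(1 for x in aligned_a if x != gap)
    len_b = sum(1 for y in aligned_b if y != gap)
    shorter = min(len_a, len_b)
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")
    return 100.0 * matches / shorter


# Hypothetical four-residue alignment with one mismatch.
print(percent_homology("MKTA", "MRTA"))
```

Dividing by the length of the shorter sequence follows the statement above that
fragments are scored using the number of amino acids in the shorter sequence.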

DRG11 proteins may be identified in one aspect by significant homology to the
areas other than the homeodomain, i.e. the N- and C-terminal portions of the
sequences depicted in the Figures. This homology is preferably greater than
about 70%, with greater than about 80% being particularly preferred and
greater than about 90% being especially preferred. In some cases the
homology will be greater than about 90 to 95 or 98%.



In addition, a DRG11 protein preferably also has significant homology to the
DRG11 homeodomain as described herein. This homology is preferably
greater than about 70%, with greater than about 80% being particularly
preferred and greater than about 90% being especially preferred. In some cases
the homology will be greater than about 90 to 95 or 98%.



DRG11 proteins of the present invention may be shorter or longer than the
amino acid sequences shown in the Figures. Thus, in a preferred embodiment,
included within the definition of DRG11 proteins are portions or fragments of
the sequences shown in the Figures.

DRG11 proteins may also be identified as being encoded by DRG11 nucleic
acids. Thus, DRG11 proteins are encoded by nucleic acids that will hybridize
to the sequence depicted in Figure 2, as outlined herein.

In a preferred embodiment, when the DRG11 protein is to be used to generate
antibodies, the DRG11 protein must share at least one epitope or determinant
with the full length protein shown in Figure 3. By "epitope" or "determinant"
herein is meant a portion of a protein which will generate and/or bind an
antibody. Thus, in most instances, antibodies made to a smaller DRG11
protein will be able to bind to the full length protein. In a preferred
embodiment, the epitope is unique; that is, antibodies generated to a unique
epitope show little or no cross-reactivity; thus, for example, the antibodies
preferably do not bind the homeodomain. In a preferred embodiment, the
antibodies are generated to the N- and C-terminal portions of the DRG11
molecule. The DRG11 antibodies of the invention specifically bind to
DRG11 proteins. By "specifically bind" herein is meant that the antibodies
bind to the protein with a binding constant in the range of at least 10^4 - 10^6
M^-1, with a preferred range being 10^7 - 10^9 M^-1.

In the case of the nucleic acid, the overall homology of the nucleic acid
sequence is commensurate with amino acid homology but takes into account
the degeneracy in the genetic code and codon bias of different organisms.
Accordingly, the nucleic acid sequence homology may be either lower or
higher than that of the protein sequence. Thus the homology of the nucleic
acid sequence as compared to the nucleic acid sequence of Figure 2 is
preferably greater than 60%, more preferably greater than about 65%,
particularly greater than about 70% and most preferably greater than 80%. In
some embodiments the homology will be as high as about 90 to 95 or 98%.

In a preferred embodiment, a DRG11 nucleic acid encodes a DRG11 protein.
As will be appreciated by those in the art, due to the degeneracy of the genetic
code, an extremely large number of nucleic acids may be made, all of which
encode the DRG11 proteins of the present invention. Thus, having identified a
particular amino acid sequence, those skilled in the art could make any number
of different nucleic acids, by simply modifying the sequence of one or more
codons in a way which does not change the amino acid sequence of the
DRG11.
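
To illustrate how quickly the number of possible coding sequences grows, the
following minimal sketch counts the distinct nucleic acid sequences that encode
a given peptide under the standard genetic code. The function name
count_coding_sequences and the example peptide are hypothetical; stop codons
and organism-specific code variants are ignored as a simplifying assumption.

```python
# Number of sense codons per amino acid in the standard genetic code
# (one-letter amino acid codes; the counts sum to 61).
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_coding_sequences(protein: str) -> int:
    """Number of distinct codon sequences that encode `protein`."""
    total = 1
    for residue in protein.upper():
        # Each position independently allows any synonymous codon.
        total *= CODONS_PER_RESIDUE[residue]
    return total


# Hypothetical four-residue peptide: 1 * 2 * 4 * 4 synonymous sequences.
print(count_coding_sequences("MKTA"))
```

Because the count is a product over all residues, even a short peptide rich in
leucine, serine or arginine is encoded by a very large number of sequences.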

In one embodiment, the nucleic acid homology is determined through
hybridization studies. Thus, for example, nucleic acids which hybridize under
high stringency to the nucleic acid sequence shown in Figure 2 or their
complements are considered DRG11 genes. High stringency conditions are
known in the art; see for example Maniatis et al., Molecular Cloning: A
Laboratory Manual, 2d Edition, 1989, and Short Protocols in Molecular
Biology, ed. Ausubel, et al., both of which are hereby incorporated by
reference.

In another embodiment, less stringent hybridization conditions are used; for
example, moderate or low stringency conditions may be used, as are known in
the art; see Maniatis and Ausubel, supra.

The DRG11 proteins and nucleic acids of the present invention are preferably
recombinant. As used herein, "nucleic acid" may refer to either DNA or RNA,
or molecules which contain both deoxy- and ribonucleotides. The nucleic acids
include genomic DNA, cDNA and oligonucleotides including sense and
antisense nucleic acids. Such nucleic acids may also contain modifications in the
ribose-phosphate backbone to increase stability and half life of such molecules
in physiological environments.

The nucleic acid may be double stranded, single stranded, or contain portions
of both double stranded and single stranded sequence. By the term "recombinant
nucleic acid" herein is meant nucleic acid, originally formed in vitro, in
general, by the manipulation of nucleic acid by endonucleases, in a form not
normally found in nature. Thus an isolated DRG11 nucleic acid, in a linear
form, or an expression vector formed in vitro by ligating DNA molecules that
are not normally joined, are both considered recombinant for the purposes of
this invention. It is understood that once a recombinant nucleic acid is made
and reintroduced into a host cell or organism, it will replicate
non-recombinantly, i.e. using the in vivo cellular machinery of the host cell rather
than in vitro manipulations; however, such nucleic acids, once produced
recombinantly, although subsequently replicated non-recombinantly, are still
considered recombinant for the purposes of the invention.



Similarly, a "recombinant protein" is a protein made using recombinant
techniques, i.e. through the expression of a recombinant nucleic acid as
described above. A recombinant protein is distinguished from naturally
occurring protein by one or more characteristics. For example, the
protein may be isolated or purified away from some or all of the proteins and
compounds with which it is normally associated in its wild type host, and thus
may be substantially pure. For example, an isolated protein is unaccompanied
by at least some of the material with which it is normally associated in its
natural state, preferably constituting at least about 0.5%, more preferably at
least about 5% by weight of the total protein in a given sample. A substantially
pure protein comprises at least about 75% by weight of the total protein, with at
least about 80% being preferred, and at least about 90% being particularly
preferred. The definition includes the production of a DRG11 protein from one
organism in a different organism or host cell. Alternatively, the protein may be
made at a significantly higher concentration than is normally seen, through the
use of an inducible promoter or high expression promoter, such that the protein
is made at increased concentration levels. Alternatively, the protein may be in
a form not normally found in nature, as in the addition of an epitope tag or
amino acid substitutions, insertions and deletions, as discussed below.
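
The weight-percent figures given above for isolated and substantially pure
protein can be read as a series of tiers. The following minimal sketch applies
them to a pair of measured masses. The function name purity_tier and the tier
labels are hypothetical, and because each figure in the text is qualified by
"about", the hard cutoffs used here are an assumption made for illustration.

```python
# (minimum percent by weight of total protein, descriptive label), highest first.
# Percentages are those stated in the text; labels are illustrative only.
TIERS = [
    (90.0, "substantially pure (particularly preferred)"),
    (80.0, "substantially pure (preferred)"),
    (75.0, "substantially pure"),
    (5.0, "isolated (more preferred)"),
    (0.5, "isolated (preferred)"),
]


def purity_tier(target_mass: float, total_protein_mass: float) -> str:
    """Return the highest tier met by target_mass / total_protein_mass."""
    if total_protein_mass <= 0:
        raise ValueError("total protein mass must be positive")
    percent = 100.0 * target_mass / total_protein_mass
    for minimum, label in TIERS:
        if percent >= minimum:
            return label
    return "below the stated thresholds"


# Hypothetical sample: 9 mg of the protein of interest in 10 mg total protein.
print(purity_tier(9.0, 10.0))
```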

Also included within the definition of DRG11 protein are other DRG11 proteins
of the DRG11 family, and DRG11 proteins from other organisms, which are
cloned and expressed as outlined below. Thus, probe or degenerate polymerase
chain reaction (PCR) primer sequences may be used to find other related
DRG11 proteins from humans or other organisms. As will be appreciated by
those in the art, particularly useful probe and/or PCR primer sequences include
the unique areas of the DRG11 nucleic acid sequence. Thus, useful probe or
primer sequences may be designed to the N- and C-terminal portions of the
sequence. As is generally known in the art, preferred PCR primers are from
about 15 to about 35 nucleotides in length, with from about 20 to about 30
being preferred, and may contain inosine as needed. The conditions for the
PCR reaction are well known in the art.
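
The primer criteria above can be illustrated with a minimal sketch that
expands a degenerate primer written in the standard IUPAC ambiguity codes and
checks its length against the stated range. The function names and the example
primer are hypothetical and are not DRG11 sequences. Inosine, written here as
"I", is treated as a single non-degenerate position, which is an assumption
reflecting its use as a base that pairs with several partners.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
    "I": "I",  # inosine: kept as one position (assumption)
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list:
    """List every concrete sequence represented by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]


def length_in_range(primer: str, shortest: int = 15, longest: int = 35) -> bool:
    """Check the primer length against the 'about 15 to about 35' range."""
    return shortest <= len(primer) <= longest


# Hypothetical 20-mer with three two-fold positions and one inosine.
primer = "ATGGARCAYTGGAARCCIGG"
print(degeneracy(primer), length_in_range(primer))
```

Replacing a four-fold position with inosine, as in the example, lowers the
degeneracy of the primer pool without restricting which templates can pair.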

Once the DRG11 nucleic acid is identified, it can be cloned and, if necessary,
its constituent parts recombined to form the entire DRG11 protein nucleic acid.
Once isolated from its natural source, e.g., contained within a plasmid or other
vector or excised therefrom as a linear nucleic acid segment, the recombinant
DRG11 nucleic acid can be further used as a probe to identify and isolate other
DRG11 nucleic acids. It can also be used as a "precursor" nucleic acid to make
modified or variant DRG11 nucleic acids and proteins.

Using the nucleic acids of the present invention which encode a DRG11
protein, a variety of expression vectors are made. The expression vectors may
be either self-replicating extrachromosomal vectors or vectors which integrate
into a host genome. Generally, these expression vectors include transcriptional
and translational regulatory nucleic acid operably linked to the nucleic acid
encoding the DRG11 protein. "Operably linked" in this context means that the
transcriptional and translational regulatory DNA is positioned relative to the
coding sequence of the DRG11 protein in such a manner that transcription is
initiated. Generally, this will mean that the promoter and transcriptional
initiation or start sequences are positioned 5' to the DRG11 protein coding
region. The transcriptional and translational regulatory nucleic acid will
generally be appropriate to the host cell used to express the DRG11 protein; for
example, transcriptional and translational regulatory nucleic acid sequences
from Bacillus are preferably used to express the DRG11 protein in Bacillus.
Numerous types of appropriate expression vectors, and suitable regulatory
sequences are known in the art for a variety of host cells.



In general, the transcriptional and translational regulatory sequences may 
include, but are not limited to, promoter sequences, ribosomal binding sites, 
transcriptional start and stop sequences, translational start and stop sequences, 
and enhancer or activator sequences. In a preferred embodiment, the regulatory 
sequences include a promoter and transcriptional start and stop sequences. 

Promoter sequences encode either constitutive or inducible promoters. The 
promoters may be either naturally occurring promoters or hybrid promoters. 
Hybrid promoters, which combine elements of more than one promoter, are 
also known in the art, and are useful in the present invention. 

In addition, the expression vector may comprise additional elements. For 
example, the expression vector may have two replication systems, thus 
allowing it to be maintained in two organisms, for example in mammalian or 
insect cells for expression and in a procaryotic host for cloning and 
amplification. Furthermore, for integrating expression vectors, the expression 
vector contains at least one sequence homologous to the host cell genome, and 
preferably two homologous sequences which flank the expression construct. 
The integrating vector may be directed to a specific locus in the host cell by 
selecting the appropriate homologous sequence for inclusion in the vector. 
Constructs for integrating vectors are well known in the art. 

In addition, in a preferred embodiment, the expression vector contains a 
selectable marker gene to allow the selection of transformed host cells. 
Selection genes are well known in the art and will vary with the host cell used. 

The DRG11 proteins of the present invention are produced by culturing a host
cell transformed with an expression vector containing nucleic acid encoding a
DRG11 protein, under the appropriate conditions to induce or cause expression
of the DRG11 protein. The conditions appropriate for DRG11 protein
expression will vary with the choice of the expression vector and the host cell,
and will be easily ascertained by one skilled in the art through routine
experimentation. For example, the use of constitutive promoters in the
expression vector will require optimizing the growth and proliferation of the
host cell, while the use of an inducible promoter requires the appropriate
growth conditions for induction. In addition, in some embodiments, the timing
of the harvest is important. For example, the baculoviral systems used in insect
cell expression are lytic viruses, and thus harvest time selection can be crucial
for product yield.

Appropriate host cells include yeast, bacteria, archaebacteria, fungi, and insect
and animal cells, including mammalian cells. Of particular interest are
Drosophila melanogaster cells, Saccharomyces cerevisiae and other yeasts, E.
coli, Bacillus subtilis, SF9 cells, C129 cells, 293 cells, Neurospora, BHK,
CHO, COS, and HeLa cells.

In a preferred embodiment, the DRG11 proteins are expressed in mammalian
cells. Mammalian expression systems are also known in the art. A mammalian
promoter is any DNA sequence capable of binding mammalian RNA
polymerase and initiating the downstream (3') transcription of a coding
sequence for DRG11 protein into mRNA. A promoter will have a transcription
initiating region, which is usually placed proximal to the 5' end of the coding
sequence, and a TATA box, usually located 25-30 base pairs upstream of the
transcription initiation site. The TATA box is thought to direct RNA
polymerase II to begin RNA synthesis at the correct site. A mammalian
promoter will also contain an upstream promoter element (enhancer element),
typically located within 100 to 200 base pairs upstream of the TATA box. An
upstream promoter element determines the rate at which transcription is
initiated and can act in either orientation. Of particular use as mammalian
promoters are the promoters from mammalian viral genes, since the viral genes
are often highly expressed and have a broad host range. Examples include the
SV40 early promoter, mouse mammary tumor virus LTR promoter, adenovirus
major late promoter, herpes simplex virus promoter, and the CMV promoter.

Typically, transcription termination and polyadenylation sequences recognized
by mammalian cells are regulatory regions located 3' to the translation stop
codon and thus, together with the promoter elements, flank the coding
sequence. The 3' terminus of the mature mRNA is formed by site-specific
post-transcriptional cleavage and polyadenylation. Examples of transcription
terminator and polyadenylation signals include those derived from SV40.

The methods of introducing exogenous nucleic acid into mammalian hosts, as
well as other hosts, are well known in the art, and will vary with the host cell
used. Techniques include the use of viruses such as retroviruses and
adenoviruses, dextran-mediated transfection, calcium phosphate precipitation,
polybrene-mediated transfection, protoplast fusion, electroporation,
encapsulation of the polynucleotide(s) in liposomes, and direct microinjection
of the DNA into nuclei.

In a preferred embodiment, DRG11 proteins are expressed in bacterial systems.
Bacterial expression systems are well known in the art.

A suitable bacterial promoter is any nucleic acid sequence capable of binding
bacterial RNA polymerase and initiating the downstream (3') transcription of
the coding sequence of DRG11 protein into mRNA. A bacterial promoter has a
transcription initiation region which is usually placed proximal to the 5' end of
the coding sequence. This transcription initiation region typically includes an
RNA polymerase binding site and a transcription initiation site. Sequences
encoding metabolic pathway enzymes provide particularly useful promoter
sequences. Examples include promoter sequences derived from genes for
enzymes that metabolize sugars such as galactose, lactose and maltose, and
sequences derived from genes for biosynthetic enzymes such as those of
tryptophan biosynthesis. Promoters from
bacteriophage may also be used and are known in the art. In addition, synthetic
promoters and hybrid promoters are also useful; for example, the tac promoter
is a hybrid of the trp and lac promoter sequences. Furthermore, a bacterial
promoter can include naturally occurring promoters of non-bacterial origin that
have the ability to bind bacterial RNA polymerase and initiate transcription.

In addition to a functioning promoter sequence, an efficient ribosome binding
site is desirable. In E. coli, the ribosome binding site is called the
Shine-Dalgarno (SD) sequence and includes an initiation codon and a sequence 3-9
nucleotides in length located 3-11 nucleotides upstream of the initiation
codon.
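
The spacing rule just stated can be illustrated with a minimal sketch that
looks for an SD-like motif ending 3 to 11 nucleotides upstream of an initiation
codon. The function name find_sd_sites and the example sequence are
hypothetical. The motif AGGAGG is the commonly cited E. coli consensus and is
supplied here as an assumption, since the text does not give a sequence; the
gap is measured from the last motif base to the first base of the start codon,
which is also an assumption about how "upstream" is counted.

```python
def find_sd_sites(seq: str,
                  motif: str = "AGGAGG",
                  min_gap: int = 3,
                  max_gap: int = 11,
                  start_codon: str = "ATG") -> list:
    """Return (motif_index, start_codon_index) pairs meeting the spacing rule.

    A pair is reported when `motif` ends between `min_gap` and `max_gap`
    nucleotides before a `start_codon` (0-based indices into `seq`).
    """
    seq = seq.upper()
    hits = []
    start = seq.find(start_codon)
    while start != -1:
        for gap in range(min_gap, max_gap + 1):
            motif_index = start - gap - len(motif)
            if motif_index >= 0 and seq.startswith(motif, motif_index):
                hits.append((motif_index, start))
        # Continue with the next occurrence of the start codon.
        start = seq.find(start_codon, start + 1)
    return hits


# Hypothetical leader: the motif at index 2 ends 7 nt before ATG at index 15.
print(find_sd_sites("TTAGGAGGAACAGCTATGAAA"))
```

An exact-match search is the simplest possible reading of the rule; real
ribosome binding sites vary in sequence, so a practical tool would score
partial matches rather than require the full consensus.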

The expression vector may also include a signal peptide sequence that provides
for secretion of the DRG11 protein in bacteria. The signal sequence typically
encodes a signal peptide comprised of hydrophobic amino acids which direct
the secretion of the protein from the cell, as is well known in the art. The
protein is either secreted into the growth media (gram-positive bacteria) or into
the periplasmic space, located between the inner and outer membrane of the
cell (gram-negative bacteria).

The bacterial expression vector may also include a selectable marker gene to
allow for the selection of bacterial strains that have been transformed. Suitable
selection genes include genes which render the bacteria resistant to drugs such
as ampicillin, chloramphenicol, erythromycin, kanamycin, neomycin and
tetracycline. Selectable markers also include biosynthetic genes, such as those
in the histidine, tryptophan and leucine biosynthetic pathways.

These components are assembled into expression vectors. Expression vectors
for bacteria are well known in the art, and include vectors for Bacillus subtilis,
E. coli, Streptococcus cremoris, and Streptomyces lividans, among others.

The bacterial expression vectors are transformed into bacterial host cells using 
techniques well known in the art, such as calcium chloride treatment, 
electroporation, and others. 

In one embodiment, DRG11 proteins are produced in insect cells. Expression
vectors for the transformation of insect cells, and in particular,
baculovirus-based expression vectors, are well known in the art.

In a preferred embodiment, DRG11 protein is produced in yeast cells. Yeast
expression systems are well known in the art, and include expression vectors
for Saccharomyces cerevisiae, Candida albicans and C. maltosa, Hansenula
polymorpha, Kluyveromyces fragilis and K. lactis, Pichia guilliermondii and P.
pastoris, Schizosaccharomyces pombe, and Yarrowia lipolytica. Preferred
promoter sequences for expression in yeast include the inducible GAL1,10
promoter, the promoters from alcohol dehydrogenase, enolase, glucokinase,
glucose-6-phosphate isomerase, glyceraldehyde-3-phosphate dehydrogenase,
hexokinase, phosphofructokinase, 3-phosphoglycerate mutase, pyruvate kinase,
and the acid phosphatase gene. Yeast selectable markers include ADE2, HIS4,
LEU2, TRP1, and ALG7, which confers resistance to tunicamycin; the
neomycin phosphotransferase gene, which confers resistance to G418; and the
CUP1 gene, which allows yeast to grow in the presence of copper ions.



The DRG11 protein may also be made as a fusion protein, using techniques
well known in the art. Thus, for example, for the creation of monoclonal
antibodies, if the desired epitope is small, the DRG11 protein may be fused to a
carrier protein to form an immunogen. Alternatively, the DRG11 protein may
be made as a fusion protein to increase expression, or for other reasons.

Also included within the definition of DRG11 proteins of the present invention
are amino acid sequence variants. These variants fall into one or more of three
classes: substitutional, insertional or deletional variants. These variants
ordinarily are prepared by site specific mutagenesis of nucleotides in the DNA
encoding the DRG11 protein, using cassette or PCR mutagenesis or other
techniques well known in the art, to produce DNA encoding the variant, and
thereafter expressing the DNA in recombinant cell culture as outlined above.
However, variant DRG11 protein fragments having up to about 100-150
residues may be prepared by in vitro synthesis using established techniques.
Amino acid sequence variants are characterized by the predetermined nature of
the variation, a feature that sets them apart from naturally occurring allelic or
interspecies variation of the DRG11 protein amino acid sequence. The variants
typically exhibit the same qualitative biological activity as the naturally
occurring analogue, although variants can also be selected which have modified
characteristics as will be more fully outlined below.

While the site or region for introducing an amino acid sequence variation is
predetermined, the mutation per se need not be predetermined. For example, in
order to optimize the performance of a mutation at a given site, random
mutagenesis may be conducted at the target codon or region and the expressed
DRG11 variants screened for the optimal combination of desired activity.
Techniques for making substitution mutations at predetermined sites in DNA
having a known sequence are well known, for example, M13 primer
mutagenesis and PCR mutagenesis. Screening of the mutants is done using
assays of DRG11 protein activities.

Amino acid substitutions are typically of single residues; insertions usually will
be on the order of from about 1 to 20 amino acids, although considerably larger
insertions may be tolerated. Deletions range from about 1 to about 20 residues,
although in some cases deletions may be much larger.

Substitutions, deletions, insertions or any combination thereof may be used to
arrive at a final derivative. Generally these changes are done on a few amino
acids to minimize the alteration of the molecule. However, larger changes may
be tolerated in certain circumstances. When small alterations in the
characteristics of the DRG11 protein are desired, substitutions are generally
made in accordance with the following chart:

Chart I

Original Residue        Exemplary Substitutions
                        Trp, Phe
                        Ile, Leu

Substantial changes in function or immunological identity are made by
selecting substitutions that are less conservative than those shown in Chart I.
For example, substitutions may be made which more significantly affect: the
structure of the polypeptide backbone in the area of the alteration, for example
the alpha-helical or beta-sheet structure; the charge or hydrophobicity of the
molecule at the target site; or the bulk of the side chain. The substitutions
which in general are expected to produce the greatest changes in the
polypeptide's properties are those in which (a) a hydrophilic residue, e.g. seryl
or threonyl, is substituted for (or by) a hydrophobic residue, e.g. leucyl,
isoleucyl, phenylalanyl, valyl or alanyl; (b) a cysteine or proline is substituted
for (or by) any other residue; (c) a residue having an electropositive side chain,
e.g. lysyl, arginyl, or histidyl, is substituted for (or by) an electronegative
residue, e.g. glutamyl or aspartyl; or (d) a residue having a bulky side chain,
e.g. phenylalanine, is substituted for (or by) one not having a side chain, e.g.
glycine.
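
The four classes (a) through (d) above can be applied mechanically to a
proposed substitution, as in the following minimal sketch. The function name
drastic_classes is hypothetical. The residue sets contain only the examples
named in the text (each introduced there by "e.g."), so they are illustrative
and not exhaustive; residues are given in one-letter code.

```python
# Residue sets limited to the examples named in the text (one-letter codes).
HYDROPHILIC = {"S", "T"}                 # seryl, threonyl
HYDROPHOBIC = {"L", "I", "F", "V", "A"}  # leucyl, isoleucyl, phenylalanyl, valyl, alanyl
ELECTROPOSITIVE = {"K", "R", "H"}        # lysyl, arginyl, histidyl
ELECTRONEGATIVE = {"E", "D"}             # glutamyl, aspartyl
BULKY = {"F"}                            # phenylalanine
NO_SIDE_CHAIN = {"G"}                    # glycine
CYS_OR_PRO = {"C", "P"}


def _between(a: str, b: str, first: set, second: set) -> bool:
    """True if the pair spans the two sets in either direction ('for or by')."""
    return (a in first and b in second) or (a in second and b in first)


def drastic_classes(original: str, replacement: str) -> list:
    """Return which of classes (a)-(d) a single-residue substitution falls in."""
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return []
    classes = []
    if _between(original, replacement, HYDROPHILIC, HYDROPHOBIC):
        classes.append("a")
    if original in CYS_OR_PRO or replacement in CYS_OR_PRO:
        classes.append("b")
    if _between(original, replacement, ELECTROPOSITIVE, ELECTRONEGATIVE):
        classes.append("c")
    if _between(original, replacement, BULKY, NO_SIDE_CHAIN):
        classes.append("d")
    return classes


# Lysine to glutamate reverses the side-chain charge: class (c).
print(drastic_classes("K", "E"))
```

An empty result means only that the substitution is not one of the named
examples; it does not by itself establish that the change is conservative.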

The variants typically exhibit the same qualitative biological activity and will
elicit the same immune response as the naturally-occurring analogue, although
variants also are selected to modify the characteristics of the DRG11 proteins
as needed. Alternatively, the variant may be designed such that the biological
activity of the DRG11 protein is altered.

In one embodiment, homeodomain variants are made. In one embodiment a
homeodomain may be eliminated entirely. Alternatively, any or all of the
amino acids of a homeodomain may be altered or deleted. In a preferred
embodiment, one or more of the amino acids of the homeodomain are
substituted by other amino acids. Thus, amino acids corresponding to the rat
DRG11 homeodomain residues depicted in the Figures may be altered.
In addition, as outlined in the Examples, the homophilic DRG11 cellular
adhesion is dependent on the presence of divalent cations; thus, the metal
binding properties of the binding domain may be altered.

In one embodiment, the DRG11 nucleic acids, proteins and antibodies of the 
invention are labelled. By "labelled" herein is meant that a compound has at
least one element, isotope or chemical compound attached to enable the
detection of the compound. In general, labels fall into three classes: a)
isotopic labels, which may be radioactive or heavy isotopes; b) immune
labels, which may be antibodies or antigens; and c) colored or fluorescent
dyes. The labels may be incorporated into the compound at any position.

In a preferred embodiment, the DRG11 protein is purified or isolated after
expression. DRG11 proteins may be isolated or purified in a variety of ways
known to those skilled in the art depending on what other components are
present in the sample. Standard purification methods include electrophoretic,
molecular, immunological and chromatographic techniques, including ion
exchange, hydrophobic, affinity, and reverse-phase HPLC chromatography,
and chromatofocusing. For example, the DRG11 protein may be purified using
a standard anti-DRG11 antibody column. Ultrafiltration and diafiltration
techniques, in conjunction with protein concentration, are also useful. For
general guidance in suitable purification techniques, see Scopes, R., Protein
Purification, Springer-Verlag, NY (1982). The degree of purification necessary
will vary depending on the use of the DRG11 protein. In some instances no
purification will be necessary.

Once expressed and purified if necessary, the DRG11 proteins are useful in a
number of applications.

In a preferred embodiment, the DRG11 gene, protein or antibody is used as a
marker for detecting the presence or absence of sensory neurons. As outlined
herein, DRG11 is specifically expressed in sensory neurons but not in
autonomic neurons or glia. Unexpectedly, DRG11 is also expressed in a subset
of the known functional targets of sensory neurons in the spinal cord. Therefore,
this putative transcription factor can be used as a molecular marker to identify
neurons in the peripheral sensory lineage. All or part of the sequence can be
used, although preferably unique sequences are used, as outlined herein.

Since the pattern of expression of DRG11 in the central nervous system
suggests that DRG11 may function in regulating some aspect of the
connectivity between these neurons and their central targets, DRG11 proteins,
nucleic acids and antibodies are useful in the screening and diagnosis of the
presence or absence of DRG11 in a variety of cell types.

The DRG11 protein is also useful in screens to identify antagonists and
agonists of DRG11, as will be appreciated by those in the art.

In one embodiment, the DRG11 proteins of the present invention may be used
to generate polyclonal and monoclonal antibodies to DRG11 proteins, which
are useful as described below. Similarly, the DRG11 proteins can be coupled,
using standard technology, to affinity chromatography columns. These
columns may then be used to purify DRG11 antibodies. In a preferred
embodiment, the antibodies are generated to epitopes unique to the DRG11
protein; that is, the antibodies show little or no cross-reactivity to other
proteins. These antibodies find use in a number of applications. For example,
the DRG11 antibodies may be coupled to standard affinity chromatography
columns and used to purify DRG11 proteins.

The presence or absence of DRG11 may be assayed or detected using labelled
DRG11 proteins, antibodies or nucleic acids. For example, methods are
provided for detecting a DRG11 protein in a target sample comprising
contacting a labelled polypeptide such as an antibody which will specifically
bind to a DRG11 protein with the target sample and assaying for the presence
of binding between the labelled polypeptide and DRG11, if present, in the
target sample. The contacting is done under conditions which allow binding to
DRG11.

The present invention also provides novel methods for differential reverse- 
transcriptase-based polymerase chain reaction (dRT-PCR). In a preferred 
embodiment, dRT-PCR allows the identification of members of a gene family 
which are specifically expressed in a given tissue or cell type. This method 
allows the determination of the differential expression of a gene in different 
cell types or tissues. The method first synthesizes libraries of nucleic acids 
from a plurality of different cell types or tissues using a set of primers. In a 
preferred embodiment, the synthesis is done using polymerase chain reaction 
(PCR), as is known in the art. 

In a preferred embodiment, the primers are degenerate oligonucleotide primers 
flanking or within a conserved domain from a given gene family, although this 
is not required. Preferably, the primers are generated to regions within a 
conserved domain from a family or superfamily of genes known in the art. 
Preferred conserved domains include, but are not limited to, transcription factor 
or DNA binding gene families such as homeodomain regions, Ets-domain (Nye 
et al., 1992), forkhead (Lai et al., 1991), scalloped/TEF (Campbell et al., 1992), 
leucine zipper, and zinc finger domains.

A portion of the library obtained from at least one of the different cell
types or tissues, i.e. the first, is then subcloned to form a subclone library.
The first cell type is generally the one in which information regarding
differential expression is sought, i.e. the one that contains the gene(s) of
interest.

Once subcloned, members of the subclone library are probed. The probes are
labelled nucleic acids derived from the above-described libraries from the first
step. The conditions for probe hybridization (or contact) are chosen to
permit the hybridization of the labelled nucleic acids to
complementary nucleic acids, if present, in the subclone library.

The patterns of hybridization using the different probes on the subclone library 
allow the determination of the differential expression of different genes, as 
outlined in the Examples, thus serving as an indication of the differential 
expression of a gene expressed by the first cell type or tissue. 

The following examples serve to more fully describe the manner of using the 
above-described invention, as well as to set forth the best modes contemplated 
for carrying out various aspects of the invention. It is understood that these 
examples in no way serve to limit the true scope of this invention, but rather are 
presented for illustrative purposes. All references cited herein are incorporated 
by reference.

EXAMPLE

Identification of DRG11 by dRT-PCR



A schematic outline of the differential RT-PCR (dRT-PCR) procedure is
illustrated in Figure 1A. cDNA from two or more cell types or tissues is
amplified using degenerate oligonucleotide primers flanking a conserved
domain from a given gene family. The PCR products from the "positive"
tissue of interest ("A-cell" in Fig. 1A) are subcloned, and the colonies are
grown in a 12 x 8 array in a microtiter plate. Multiple replicas of this array are
prepared by spotting small aliquots of each liquid bacterial culture onto nylon
filters (Wilkie and Simon, 1991). These filters are then annealed with [32P]
probes made from the same RT-PCR products, as well as from the RT-PCR
products derived from one or more "negative" tissues ("B, C, D...etc.-cells" in
Fig. 1A). Clones displaying differential hybridization (A+B-C-D-) are picked
for sequencing.

Molecular cloning was performed according to standard procedures (Sambrook
et al., 1989), with minor modifications. Total RNA was prepared by the acid
guanidinium thiocyanate method as described (Chomczynski and Sacchi, 1987)
with slight modification. cDNA was synthesized from 2 µg of total RNA in 50
µl by using random hexamer primers and reverse transcriptase, and 1 µl of this
reaction mixture was used for PCR. Degenerate oligonucleotide primers
corresponding to the sequence coding for amino acids FTAYQLE and the
complementary sequence coding for amino acids QVWFQNR (N-terminal and
C-terminal portions of the paired type homeodomain from the Drosophila
protein RK2/repo (Campbell et al., 1994; Xiong et al., 1994)) were used for
PCR: CGGGATCCTT(TC)ACIGCITA(TC)CA(GA)(TC)TIGA and
CGGAATTC(GT)(GA)TT(TC)TG(GA)AACCAIAC(TC)TG. DNA was
amplified by Taq DNA polymerase under the following conditions: 5 reduced
stringency cycles using: 94°C for 1 min, 42°C for 1 min, 55°C for 1 min, and
72°C for 1 min; followed by 33 cycles using: 94°C for 1 min, 55°C for 1 min,
and 72°C for 1 min. PCR products were fractionated on agarose gels and DNA
fragments of the expected size (~150 bp) were purified using the Mermaid kit
(Bio101). Purified DNA fragments were reamplified using 28 cycles of 94°C
for 1 min, 55°C for 1 min, and 72°C for 1 min, purified by phenol extraction,
and digested with EcoRI and BamHI. Digested DNA fragments with a size of
~150 bp were purified from agarose gels (see above). An aliquot of the
digested DNA fragments was ligated into pBluescript for subcloning and
transformed into E. coli. Another aliquot of these DNA fragments was saved
for use as a probe in differential hybridization. Each transformant was
transferred into 70 µl of LB-amp medium in a well of a 96-well plate and
cultured at 37°C for 14 hr with shaking. 5 µl of the liquid culture was spotted
onto replica filters (Genescreen, Dupont) using a multichannel pipettor. The
filter was treated according to the manufacturer's protocol and hybridized with
a 32P-labelled probe prepared by random-primed labelling of the other aliquot
of the restriction-digested PCR products. Nucleotide sequences of the clones
which showed differential distribution were determined.
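
The degenerate primer notation used above, in which parenthesised bases are alternatives at one position and I denotes inosine, can be unpacked with a short script. The following is a minimal sketch, not part of the original procedure: the function names are hypothetical, inosine is treated as a single fixed symbol, and only the first (BamHI-tailed) primer is used as input.

```python
import re
from itertools import product

# First primer from the text: a BamHI-containing 5' tail followed by the
# degenerate sequence coding for FTAYQLE. "I" is inosine.
FORWARD_PRIMER = "CGGGATCCTT(TC)ACIGCITA(TC)CA(GA)(TC)TIGA"


def parse_degenerate(primer):
    """Return one string of allowed symbols per primer position."""
    tokens = re.findall(r"\(([ACGT]+)\)|([ACGTI])", primer)
    # Each parenthesised group consumes its letters plus two brackets.
    consumed = sum(len(group) + 2 if group else 1 for group, _ in tokens)
    if consumed != len(primer):
        raise ValueError(f"unrecognised characters in primer: {primer!r}")
    return [group or single for group, single in tokens]


def degeneracy(primer):
    """Number of distinct oligonucleotides in the degenerate mixture."""
    count = 1
    for choices in parse_degenerate(primer):
        count *= len(choices)
    return count


def expand(primer):
    """List every oligonucleotide in the mixture (inosine left as 'I')."""
    return ["".join(combo) for combo in product(*parse_degenerate(primer))]


if __name__ == "__main__":
    print("length:", len(parse_degenerate(FORWARD_PRIMER)))
    print("degeneracy:", degeneracy(FORWARD_PRIMER))
    for oligo in expand(FORWARD_PRIMER)[:4]:
        print(oligo)
```

Using inosine at the fourfold-degenerate codon positions keeps the mixture small: under the assumptions of this sketch, only the four twofold positions contribute to the degeneracy.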

Southern blotting of PCR products 

PCR products of RT-PCR were fractionated by agarose gel electrophoresis,
transferred onto replicate Genescreen filters and hybridized with 32P-labelled
probes derived from the DNA fragments encoding the homeodomains of
DRG11, Pax3 or NCM3. As a control for the amount of amplified cDNA in
the gel blot experiments, parallel aliquots were amplified for β-actin using
gene-specific primers. The PCR product corresponded to a 310 bp fragment
spanning nucleotides 767-1077 of the rat β-actin mRNA, and was amplified
with the following primers: TCATGAAGTGTGACGTTGACATCC and
GTAAAACGCAGCTCAGTAACAGTC. Conditions were 20 cycles of: 94°C
for 1 min, 60°C for 1 min, 72°C for 1 min.

For simplicity, examples of only four clones (Fig. 1B, columns 1-4) derived
from paired homeodomain (PHD)-primer amplification of embryonic day 13.5
(E13.5) rat DRG cDNA, annealed in quadruplicate with four different probes,
are illustrated. The probes used were PHD-amplified PCR products from:
DRG, NCM-1 cells (a glial progenitor cell line (Lo et al., 1990)), MAH cells (a
sympathoadrenal progenitor cell line (Birren and Anderson, 1990)) and Rat-1
fibroblasts (a non-neural cell line). Two pieces of information are derived from
this screen. First, the relative intensities of the hybridization signals exhibited
by a single clone annealed with multiple probes give an indication of its
relative abundance within its gene family in different cells or tissues (Fig. 1B,
read vertically). Second, the relative intensities of the hybridization signals
exhibited by multiple clones annealed with a single probe provide an
indication of their relative abundance within that cell or tissue type (Fig. 1B,
read horizontally). Taken together, such information provides a unique
"fingerprint" for each clone. Such fingerprint data allow a preliminary
assessment of the number of distinct differentially-expressed sequences within
an array of clones.

In the example shown in Fig. 1B, clone 1 is strongly expressed in DRG, weakly
in NCM-1 cells, and not above background in MAH and Rat-1 cells
(background is defined by the weakest hybridization signal observed on a clone
array with a given probe); this clone was later identified as Pax3 (Goulding et
al., 1991). Clones 2 and 3 are expressed strongly in DRG, and not above
background in NCM-1, MAH and Rat-1 cells; these clones were subsequently
identified as a novel PHD protein which we have called DRG11. Clone 4 is
expressed in DRG and NCM-1 cells, and weakly in Rat-1 cells.
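
The selection logic described above can be sketched in a few lines. This is a hedged illustration rather than the procedure actually used: the intensity values are invented arbitrary units chosen only to mirror the qualitative pattern described for clones 1-4, "clone5" is a hypothetical clone at background with every probe, and the `fold` threshold over background is an assumption (the text defines background as the weakest signal on an array with a given probe but states no numerical cutoff).

```python
# Invented signal intensities (arbitrary units), one dict per probe.
SIGNALS = {
    "DRG":   {"clone1": 900, "clone2": 800, "clone3": 850, "clone4": 400, "clone5": 40},
    "NCM-1": {"clone1": 150, "clone2": 45,  "clone3": 50,  "clone4": 500, "clone5": 40},
    "MAH":   {"clone1": 35,  "clone2": 30,  "clone3": 32,  "clone4": 31,  "clone5": 30},
    "Rat-1": {"clone1": 30,  "clone2": 28,  "clone3": 25,  "clone4": 120, "clone5": 26},
}


def call_expression(signals, fold=2.0):
    """Flag, per probe, the clones whose signal exceeds fold x background.

    Background for a probe is the weakest signal seen on the array with
    that probe, following the definition given in the text.
    """
    calls = {}
    for probe, by_clone in signals.items():
        background = min(by_clone.values())
        calls[probe] = {
            clone: value > fold * background for clone, value in by_clone.items()
        }
    return calls


def differential_clones(signals, positive, negatives, fold=2.0):
    """Clones above background with the positive probe and with no negative probe."""
    calls = call_expression(signals, fold)
    return sorted(
        clone
        for clone, is_on in calls[positive].items()
        if is_on and not any(calls[probe][clone] for probe in negatives)
    )


if __name__ == "__main__":
    print(differential_clones(SIGNALS, "DRG", ["NCM-1", "MAH", "Rat-1"]))
```

With these invented values the sketch selects clone2 and clone3, the pattern described in the text for the clones later identified as DRG11; clone1 is excluded because of its weak NCM-1 signal.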

To further confirm the specificity of expression of the clones identified in the
initial differential screen, the hybridization procedure was reversed: inserts
from three of the clones (DRG11, Pax3 and NCM3) were used as hybridization
probes on gel blots of total PHD RT-PCR products from various cell lines, or
tissues at different stages (data not shown). Consistent with the clone blot data
(Fig. 1B), the DRG11 probe annealed to a single band in PHD RT-PCR
products from DRG, but not in MAH, Rat-1 or NCM-1 cells. By this analysis,
DRG11 mRNA appears to be expressed in sensory ganglia but not in cell lines
representing sympathetic neuron or glial precursors, or in fibroblasts. As
expected from the clone blot analysis, hybridization patterns distinct from that
of DRG11 were obtained with the Pax3 and NCM3 probes.

This dRT-PCR method has several advantages over conventional brute-force
RT-PCR using degenerate primers. First and foremost, only those RT-PCR
products that display differential expression between cell types or tissues are
selected for sequencing. This greatly reduces the number of PCR products that
must be sequenced for any set of degenerate primers to identify interesting
genes. Second, the intensity of the hybridization signals exhibited by a given
clone annealed with multiple probes, as well as compared to other clones
annealed with the same probes, provides a unique "fingerprint" for that clone.
In our experience, clones displaying similar fingerprints usually contain the
same insert sequence. Thus, by selecting for sequencing only those clones
which exhibit different hybridization fingerprints, the characterization of
redundant inserts is reduced. Third, although sequencing of inserts from a PCR
experiment may no longer be rate-limiting due to automation, the analysis of
the expression pattern of each of the PCR products still represents a major
bottleneck. dRT-PCR reduces the number of genes whose expression patterns
have to be characterized by identifying differentially-expressed sequences at an
early stage in the procedure. These three factors combined allow a relatively
rapid assessment of whether a set of degenerate primers will identify gene
family members differentially-expressed among a set of tissues or cell types.
This in turn permits rapid screening of multiple pairs of degenerate primers,
either within a single gene family or among several different families. Similar
approaches related to dRT-PCR have been described previously by others
(Boehm, 1993; Lai and Lemke, 1991; Wilkie and Simon, 1991).
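
The second advantage, sequencing only one clone per distinct fingerprint, can be illustrated with a small grouping routine. This is a sketch under stated assumptions: the three-level scoring ("strong", "weak", "none") is a simplification introduced here, the function names are hypothetical, and the MAH score for clone 4 is assumed to be "none" because the text does not state it.

```python
from collections import defaultdict

PROBE_ORDER = ["DRG", "NCM-1", "MAH", "Rat-1"]

# Qualitative scores following the description of Fig. 1B in the text.
CLONES = {
    "clone1": {"DRG": "strong", "NCM-1": "weak",   "MAH": "none", "Rat-1": "none"},
    "clone2": {"DRG": "strong", "NCM-1": "none",   "MAH": "none", "Rat-1": "none"},
    "clone3": {"DRG": "strong", "NCM-1": "none",   "MAH": "none", "Rat-1": "none"},
    "clone4": {"DRG": "strong", "NCM-1": "strong", "MAH": "none", "Rat-1": "weak"},
}


def group_by_fingerprint(clones, probe_order):
    """Map each fingerprint (tuple of scores in probe order) to its clones."""
    groups = defaultdict(list)
    for name, scores in clones.items():
        fingerprint = tuple(scores[probe] for probe in probe_order)
        groups[fingerprint].append(name)
    return dict(groups)


def representatives(groups):
    """Pick one clone per fingerprint group to send for sequencing."""
    return sorted(min(members) for members in groups.values())


if __name__ == "__main__":
    groups = group_by_fingerprint(CLONES, PROBE_ORDER)
    for fingerprint, members in groups.items():
        print(fingerprint, members)
    print("sequence:", representatives(groups))
```

Here clones 2 and 3 share a fingerprint, so only one of them would be sequenced, which is consistent with the text's observation that clones with similar fingerprints usually contain the same insert.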
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An unexpected problem is that the in vivo expression pattern of several 
differentially-expressed dRT-PCR products was inconsistent with expectations 
based on their differential distribution in the original screen (T. Saito, L. 
Sommer and D. Anderson, unpublished observations). In some cases, a gene
was not detectably expressed in the expected tissue but rather was highly
expressed elsewhere. Similar situations were encountered in a screen of 
receptor tyrosine phosphatases (L. Sommer and D. Anderson, unpublished 
data). 

The reason(s) for this discrepancy are not clear. It may reflect aberrant
expression of genes in some of the cell lines used as sources of cDNA.
Alternatively, in cases where dissected tissues were used it could reflect the
efficient amplification of sequences present in minor or contaminating cell
types. These are biological problems rather than problems with the dRT-PCR
method per se, but they illustrate the importance of judiciously selecting
multiple tissues and/or cell lines for the initial dRT-PCR screen. Another
potential explanation is that some genes may be preferentially amplified by a
specific primer set, which would be a problem intrinsic to the PCR method
rather than to dRT-PCR. A final possibility, specific to dRT-PCR, derives
from the fact that hybridization signal intensity in the clone blots actually
reflects the relative abundance of that sequence within the gene family of
interest, rather than in the cDNA population as a whole. In the limiting case
where a given cell type expresses only a single family member, that sequence
would represent 100% of the hybridization probe (and therefore yield an
intense hybridization signal from that cell type) even if it is actually expressed
at very low levels in that cell or tissue. Such artifacts can be minimized by
performing RNase protection experiments for individual clones on different
tissues. In any case, the pattern of specificity exhibited by the dRT-PCR
procedure should be considered tentative until confirmed by in situ
hybridization data. This in turn raises another problem, in that the short
(typically < 250 bp) PCR products provide insufficiently sensitive probes for
the nonradioactive in situ hybridization procedure, necessitating additional time
and effort to isolate longer cDNAs. The value of this method would be
improved if sensitive and reliable non-radioactive in situ hybridization
procedures could be performed directly with the PCR products. Nevertheless,
the prescreening of clones in dRT-PCR still reduces the number of genes that
have to be characterized by these more labor-intensive procedures.

In summary, our data support the idea that transcription factors provide useful and
specific markers for subtypes of neurons in the vertebrate nervous system. We
have developed and applied a method that combines degenerate PCR
amplification with differential hybridization to allow rapid screening of gene
families for members which are differentially expressed among populations of
neural cell types. This screen has yielded a novel paired homeodomain protein,
DRG11, that represents the first sensory neuron-specific transcription factor
identified in mammals.



cDNA library construction and screening

Poly(A+) RNA was purified from
1 µg of total RNA from E13.5 rat dorsal root ganglion (DRG) by using
oligo(dT) magnetic beads (Dynal), and converted to cDNA using the
Superscript choice system (Gibco BRL). The cDNA was ligated to a
pre-annealed mixture of oligonucleotides
ACTGAAGCCAAGGTAGGATCCG A and (phosphorylated)
CGGATCCTACCTTGGCTTCAGTAG. The ligated cDNA was purified using
the Spinbind PCR purification system (FMC) and amplified by 3 rounds of
PCR (16 cycles, 12 cycles and 9 cycles) using a phosphorylated
oligonucleotide (CTACTGAAGCCAAGGTAGGATCCG) under the following
conditions: 95°C for 1.5 min, 64°C for 2 min, and 72°C for 7 min. After each
round of PCR, a small aliquot of the amplified DNA (one tenth of the total
reaction mixture) was used for the next round of PCR. 32P-dCTP was added in
the third round of PCR to calculate the amount of amplified DNA. ~1.3 µg of
DNA was obtained after the third round of PCR. cDNA fragments longer than
500 bp were separated on a size-fractionation column (Gibco BRL) and cloned
into the lambda ZapII vector (Stratagene).



The library was screened using a DNA probe containing the homeodomain of
the DRG11 RT-PCR product. Two cDNA clones containing 2.4 kb and 1.0 kb
inserts were obtained. Nucleotide sequences were determined on both strands
using Sequenase (USB). The sequence of DRG11 has been submitted to
GenBank under accession number U29174.



The deduced amino acid sequence of the longest cDNA (2.4 kb) obtained is
presented in Figure 3. The clone encodes a novel 28.6 kD protein in the PHD
family. As the 280 bp sequence upstream of the most N-terminal methionine
contains an in-frame translational termination codon, this methionine has been
tentatively assigned as the initiation codon. Attempts to determine the size of
native DRG11 mRNA by Northern blotting of either total or polyA(+) RNA
from rat embryos were unsuccessful, most likely due to the low abundance of
the transcript. Moreover, the hamster anti-rat DRG11 monoclonal antibodies
did not work well in Western blotting experiments; therefore the size of the
native DRG11 protein has not been established. Thus, it cannot be ruled out
that this protein is larger than that predicted by the deduced amino acid
sequence from the cDNA clone.



DRG11 is closely related in sequence to Pax3 as well as to several other paired
homeodomain proteins (Fig. 4). Unlike the Pax genes, however, DRG11 lacks
a paired domain (Fig. 4). Interestingly, DRG11 has a Gln residue instead of a
Ser at position 9 of the recognition helix (residue 56 in Fig. 4); a substitution
of Gln for Ser at this position also occurs in several other family members that
lack a paired domain (e.g., Phox2 and Cart1 in Fig. 4).

In situ hybridization

Non-radioactive in situ hybridization was performed as
described previously (Birren et al., 1993). The following probes were used:
SCG10 (Stein et al., 1988); trkA, from pDM97 (the gift of L. Parada); trkB,
from pFRK16; trkC, from pRtrkc8 (the gifts of G. Yancopoulos). Islet-1 and
Pax3 clones, which carry 0.9 and 1.2 kb of cDNA respectively, were obtained
by RT-PCR.

The earliest stage at which DRG11 mRNA could be detected was at E12.5, 1-2
days following the initial condensation of neural crest cells to form dorsal root
ganglia. Expression at this age was restricted to the nervous system, and within
the trunk region was restricted to the dorsal root ganglia; no expression was
detected in the neural tube (data not shown). By contrast, other transcription
factors expressed in sensory neurons at this stage, such as Isl-1 and Pax3, are
also expressed in the neural tube.

By E15.5, DRG11 expression was strong in trunk DRG (data not shown) and
was also detected in the dorsal spinal cord. Positive cells were located both
laterally near the dorsal root entry zone, and more medially near the ventricular
zone. Between these two zones, there was a region containing only scattered
DRG11-positive cells, although this region does contain neurons as shown by
SCG10 hybridization. This pattern suggests that DRG11 is expressed by cells
which display the known migration pattern of newly-born dorsal horn neurons
(Langman and Haden, 1970): these cells are initially generated in the
ventricular zone, then migrate laterally through an intermediate region to take
up their final position at the dorso-lateral margins of the spinal cord, where
they create a second region of high DRG11 expression. The DRG11 expression
pattern is distinct from that of a related PHD protein, Pax3 (Goulding et al.,
1991), which is also expressed in the dorsal spinal cord at this stage but only in
the ventricular zone.

Although DRG11 was detected in sensory but not sympathetic ganglia (not
shown) at E15.5, this apparent specificity could simply reflect the timing of
differentiation in the two groups of neurons; sympathetic development is
known to lag behind sensory development by several days. For example, trkA
mRNA is initially expressed in sensory but not sympathetic neurons at E12.5
(Martin-Zanca et al., 1990), but is later detected in sympathetic ganglia
beginning on E16.5-E17.5 (Birren et al., 1993). To determine whether this is
also true for DRG11, the expression of DRG11 was examined in sections
through the anterior trunk region of E17.5 embryos. These sections contain
large sympathetic ganglia expressing SCG10, trkA and Isl-1, but not DRG11
(data not shown). This specificity of DRG11 expression in the PNS is also
maintained at postnatal day 3 (not shown). At E17.5, DRG11 expression in the
spinal cord appeared increased in the dorsal horns relative to E15.5.
Interestingly, this region of the spinal cord receives synaptic input from the
DRG (Kandel et al., 1991). Examination at high magnification of sections
hybridized with the DRG11 cRNA probe revealed that the DRG11+ cells have a
process-bearing morphology, suggesting that they are neurons (not shown).
These data suggest that DRG11 is expressed both by sensory neurons in the
DRG and by a subset of their synaptic target neurons in the dorsal spinal cord.
At no time, however, was DRG11 expression detected in the ventral spinal cord,
which also receives sensory innervation. No DRG11 expression was detected
outside of the nervous system at any of the stages examined.

The expression of DRG11 in sensory ganglia as well as in a subset of their
CNS targets raised the question of whether DRG11 is expressed only by those
sensory neurons that project to the dorsal spinal cord. These neurons include
the nociceptive, NGF-dependent subset (Ruit et al., 1992) which expresses
trkA. These neurons can be distinguished from other sensory neurons which
express trkC, are NT-3-dependent and project to the ventral spinal cord, many
of which are proprioceptive (for review, see (Snider, 1994)). The expression of
DRG11 in E13.5 trigeminal sensory ganglia as well as in trunk DRG appeared
broader than that of trks A, B or C in nearby sections, however (data not
shown). Rather, the extent of DRG11 expression was similar to that of
SCG10 (Stein et al., 1988) or Isl-1, two markers which label all neurons in
these sensory ganglia. In addition, DRG11 expression was detected in both
small and large DRG neurons at E17.5; in contrast the larger neurons expressed
trks B and C but not A (data not shown). This suggests that, at least at this
stage, DRG11 expression includes, but is not restricted to, the NGF-responsive
subset of sensory neurons. That such neurons do express DRG11, however, is
supported by double-label immunocytochemical labeling experiments (see
below).

Antibody production

DRG11 protein expressed in bacteria was gel-purified
and used to immunize Armenian hamsters. Hamster spleen cells were fused
with P3X63Ag8U.1 mouse myeloma cells. Supernatants were first screened on
dot blots of recombinant DRG11 protein, and positives from this screen were
rescreened by immunofluorescence staining of CHO cells transfected with a
mammalian DRG11 expression construct.

To determine whether DRG11 expression within ganglia is restricted to
neurons or is common to neurons and nonneuronal (glial) cells, its expression
was examined in dissociated postnatal DRG cultures using the monoclonal
antibody. Staining of perinatal rat dissociated DRG cultures revealed nuclear
immunoreactivity in many sensory neurons, but not in glia. Approximately
60% of the neurons were labeled by the antibody under these conditions.
Whether the apparent lack of DRG11 expression in other sensory neurons
reflects distinct sensory lineages, or rather environmental modulation in
culture, is currently being investigated. As an additional control for the
specificity of the antibody, similar dissociated cultures of superior cervical
sympathetic ganglia (SCG) were stained. No labeling of any cells was
detected. Thus, as predicted by its initial selection in the dRT-PCR screen
based on expression in DRG but not in MAH or NCM-1 cell cDNA, DRG11 is
expressed in many sensory neurons but not in sympathetic neurons or glial
cells.

The availability of a monoclonal antibody to DRG11 allowed a preliminary
assessment of whether this transcription factor is indeed expressed by the
trkA-expressing subset of DRG sensory neurons, as suggested by the in situ
hybridization data. To address this question, dissociated cultures of E16.5
DRG were double-labeled with anti-DRG11 and a specific polyclonal
antiserum to trkA (Clary et al., 1994). Many trkA+ neurons co-expressed
DRG11 in their nuclei (data not shown). Conversely, it appeared that the
majority of DRG11+ cells also expressed trkA, although a few DRG11+ trkA-
cells could be observed. Finally, some neurons in the cultures expressed
neither marker. These data confirm that DRG11 is expressed by
NGF-responsive DRG sensory neurons, many of which are nociceptive neurons
that project to the dorsal horn of the spinal cord (Snider, 1994) where DRG11
is also expressed. However, DRG11 expression is also detected in some
sensory neurons that do not express trkA, consistent with the in situ
hybridization data.

The timing of DRG11 expression, 1-2 days after neurons are first detected in
the DRG by expression of pan-neuronal markers such as SCG10, suggests that
this putative transcriptional regulator is unlikely to be required for initial
neuronal differentiation. Rather, this protein is likely to regulate
later-developing aspects of sensory neuron phenotype or function. In this
respect, the homology of DRG11 to another PHD protein, Phox2, is of interest.
Phox2 is expressed specifically in developing autonomic ganglia but not in
trunk sensory ganglia (Valarche et al., 1993), a pattern which is strikingly
complementary to that of DRG11. Correlative data from in vivo and in vitro
experiments, as well as DNA-binding data, suggest that Phox2 may be
involved in the expression of neurotransmitter synthesizing enzymes such as
DBH (Tissier-Seta et al., 1993). By analogy, DRG11 could play a role in
specifying some aspect of the complex neurotransmitter phenotype of sensory
neurons. Alternatively, if as suggested below, DRG11 function is important in
appropriate synapse formation, its downstream targets could include cell
surface proteins important in establishing or maintaining proper connectivity.
In this respect it is of interest that Phox2 has also been shown to regulate the
promoter of Ncam, a cell surface adhesion molecule (Tissier-Seta et al., 1993;
Valarche et al., 1993).

An intriguing feature of DRG11 expression is that it is detected both in sensory
neurons and in a subset of their target neurons in the spinal cord, specifically
those in the dorsal horn. A subset of these central neurons receives input from
the NGF-dependent population of sensory neurons, which (as shown by
double-labeling with antibodies to trkA) also expresses DRG11. This suggests
that DRG11 might function in regulating some aspect of synapse formation
between sensory neurons and their central targets. However, DRG11
expression within sensory ganglia was more extensive than that of trkA; for
example it was detected in large neurons which typically project to the ventral
spinal cord (Snider, 1994). This suggests that DRG11 expression is not
restricted to NGF-dependent sensory neurons. This indicates that DRG11
expression cannot be sufficient to specify the central connectivity of such
neurons. However, it could be necessary for some aspect of this process. This
notion is strengthened by the fact that DRG11 first appears in the dorsal spinal
cord at E15.5, about the time at which the first sensory afferents are growing
into this region of the CNS. Studies in the chick indicate that some aspects of
dorsal horn development depend upon sensory axon ingrowth (Sharma et al.,
1994).

CLAIMS



1. A recombinant nucleic acid encoding a DRG11 protein. 



2. A recombinant nucleic acid according to claim 1 encoding the amino acid 
sequence depicted in Figure 2. 



3. A recombinant nucleic acid according to claim 1 which will hybridize to the 
nucleic acid depicted in Figure i 




4. A recombinant nucleic acid according to claim 1 comprising the nucleic 
acid depicted in Figure 2. 



5. An expression vector comprising transcriptional and translational regulatory 
DNA operably linked to DNA encoding a DRG11 protein. 



6. A host cell transformed with an expression vector according to claim 5. 



7. A method of producing a DRG11 protein comprising: 

a) culturing a host cell transformed with an expression vector 
comprising a nucleic acid encoding a DRG11 protein; and 

b) expressing said nucleic acid to produce a DRG11 protein. 

8. A recombinant DRG11 protein. 



9. A recombinant DRG11 protein according to claim 8 encoded by a nucleic 
acid which hybridizes to the nucleic acid sequence shown in Figure 2. 




10. A recombinant DRG11 protein according to claim 8 which is at least 
about »Q% homologous to the amino acid sequence shown in Figure 3. 



11. A recombinant DRG11 protein according to claim 8 which has the amino 
acid sequence shown in Figure 3. 



12. An antibody capable of specifically binding to a DRG11 protein. 



13. A method for detecting a DRG11 protein in a target sample comprising 
contacting an antibody according to claim 12 with said target sample and 
assaying for the presence of binding between said polypeptide and DRG11, if 
present, in said target sample. 




14. A method for determining the differential expression of a gene in different 
cell types or tissues comprising: 

a) synthesizing libraries of nucleic acids from a plurality of different 
cell types or tissues using a set of primers; 

b) subcloning a portion of the library obtained from a first of said 
different cell types or tissues to form a subclone library; 

c) separately contacting members of said subclone library with probes 
each of which comprises one of said libraries wherein said nucleic acids 
in said libraries are labeled and wherein said contacting is under 
conditions which permit the hybridization of said labeled nucleic acids 
to complementary nucleic acids, if present, in said subclone library; and 

d) determining whether hybridization has occurred for each of said 
probes for members of said subclone library as an indication of the 
differential expression of a gene expressed by said first cell type or 
tissue. 



ABSTRACT 



The invention relates to novel paired homeodomain proteins, nucleic acids and 
antibodies, and to a novel method of differential reverse-transcriptase based 
polymerase chain reaction (dRT-PCR). 

1     gcagagSg   GCAGGGTTCC CGAGCCGCTC TCCCGGCTCC CTGCTCTGGG
51    CCTTGGGGCT CCACCGGCTT CTTGGCCCGA GCTGCTGCGC GTGCAGATGG
101   CCTTGCGCGA TCGCCGGACC CCGCTGCGGT GGCCAAGTGC AGGGCTTGTG
151   GCTGGGACCC CTGAGAACCA GGAGCCAGAC TGTGCTCAGC TTGCCAGGCC
201   GGAGCCACGC ACGGGCACAA GTCTGTCAGG CCGCCATCAG TCCTGGTCCA
251   GCCGTCAGGG CCCATCCGAC CGTCGGCGAT GTTTTATTTC CACTGCCCGC
301   CACAGCTAGA GGGCACAGCG CCTTTTGGTA ACCACTCTAC GGGGGATTTT
351   GATGATGGGT TTCTTAGAAG AAAACAGCGC AGAAATCGGA CAACCTTCGC
401   TCTTCAGCAG TTGGAAGCTC TGGAGGCAGT CTTTGCCCAA ACACACTACC
451   CAGATGTCTT CACCAGAGAA GAGCTAGCCA TGAAAATAAA CCTCACAGAA
501   GCCAGAGTGC AGGTTTGGTT CCAGAACCGA AGAGCCAAGT GGAGGAAGAC
551   AGAGAGAGGG GCCTCTGACC AGGAACCAGG GGCTAAGGAA CCCATGGCAG
601   AGGTGACACC ACCCCCAGTG AGGAACATCA ACTCTCCACC CCCAGGGGAC
651   CAGGCCCGGG GCAAGAAGGA GGCCCTGGAG GCCCAGCAGA GCCTGGGACG
701   CACAGTGGGC CCCGCCGGGC CTTTCTTCCC CTCCTGCTTG CCAGGGACCC
751   TCCTGAACAC AGCCACTTAT GCCCAGGCCC TGTCCCATGT GGCATCTCTG
801   AAAGGGGGCC CACTGTGCTC TTGCTGCGTC CCAGACCCTA TGGGGCTCTC
851   CTTCCTCCCC ACTTACGGTT GCCAGAGTAA CCGCACAGCC AGCGTGGCTG
901   CCCTGCGCAT GAAGGCCCGC GAGCATTCAG AAGCGGTCCT GCAGTCTGCC
951   AACCTTCTGC CGTCCACCAG CAGCAGCCCC GGCCCTGCCT CCAAGCAGGT
1001  GCCTCCAGAA GGCAGCCAGG ACAAGCCCTC CCCAACGAAG GAACAGAGCG
1051  AGGGAGAGAA GAGCGTATGA GGGTCCGGAG AACCCAGCTG GGAGCCCTGC
1101  CCACCCCTGC TTCTCTCAGC CTCAGCCCTG CCAGCCTCTG AACCACAAGG
1151  AGTAGCCACC TCCTCATGGA TCTGACAGGG CAAACGGGAC CTGCAAGCTG
1201  GTTGAGACCT GAAGAGTCCC TCTAGAATTC TGCTGGTAGG CTGTGTTGTT
1251  CTCGCTTTTC CTTTGGTGAC ATTTTCCGAT GGCTCTTAGT GACTCTGGAC
1301  ACTGCTCTGT GATGAGGTCC CTGTTTTTTG CTTTTTGTTT TGTCTCTTTT
1351  TTTTTGTTTT GTTTTGTTTT ATTTTCCAGG CCAAGCAGCC TTGGAGCAAA
1401  GCAGATTAGT TTATTCCACC ATCCTTCTTG AGATATCTGG GAAGGTCTTG
1451  TCAATTCCAA GGACTGTGGC AAGGATCATC CGTGAAAGAT GCCAAGAAGT
1501  GACATCTCAT GACAGGAAAT GAGACGGGCA CTCCCATATT GCTTAAGAAC
1551  CACAGAACTG GTGGACTATC AGCCAGTTCT CACTCCCTTC AGCCAGGACT
1601  GGCATCGGCC TCCTTTGTCT TGTTTAAAGG AATTAGCTGA GGTTTTGGCT
1651  AGGAAGTGAC AAGATATGGG CTGAAGACAT TGTGGTCCTG
      GGGATGGGGG
      CTGCTCTGCT GATTCTGTGT GTGGGTTGCC
      TGCAATTAGA
      GCAGGCCCCG CTCTCTTCAG AAGAGTGATG
      r> A APTV'AOA AfP
      GAA1 LAGAA1
      Gl AGG1 1 1G1 AGCLCAGGAA AGGACLAGAG
      iCLl 1GAAGC
1901  GGTAGGAAAT CCCTAGGAAG GCCCCTTAAA TACTTATGCC CAGATGAGCT
1951  GCCCTTCTTC CTATCCCCGT ATGTCGAGAG GTTGACGAGA CAGGAAAGCC
2001  AGGAAGATGA CTCCGTGTGG CAGAAGAGAA TGGAGTCCAA AGGGCCAACT
2051  TTCACAGAGA TTTCTGCCGC AGTTTAGCGT GGCTGTGTTC TTTCACGCGA
2101  TGGTGACTTC GGAGAGATCA GAGGGAGATG TGCAATAGCA TGAGCCCCGC
2151  TCCTGGCCCG GGTCCTGGAA AGGTTGTGGT TGTTTGGTGG CTTTGGCTGA
2201  TGATGTTTCC ACGCAAACAG ATATTGCTTT CATGATGGCT GTTCTCATTT
2251  CAGTTCTGAT AATCGAGACG CTGTGCTCCC AGGCGCTCTG CCTCCCCTTA
2301  ACTCTTCAGG AGCACCCCCT CCCCTGTAAT ACTCCTAAGT GTATCGTGCC
2351  TCACTTACGG TTACTGCAAC ACATTTGATG GAACACACTG TCTCCTTTAA
2401  AAAAGAAAAA AAAAAAAAAA AAAA